Early detection system and method for encrypted signals within packet networks

ABSTRACT

A system and method for providing early detection of encrypted signals within a secure connection for voice over Internet protocol (VoIP). The system and method includes a non-complex, in-band, and early encryption detector within the voice path. A transmitter sends out a known pattern. Based upon the received pattern, the receiver decides whether its encryption capabilities match up with those of the transmitter. If the capabilities do not match, then the receiver waits for the signalling message for the correct mode of operation. No packets are utilized until the receiver and transmitter encryption capabilities are matched.

FIELD OF THE INVENTION

The present invention relates generally to providing enhanced security for Internet telephony calls. More particularly, the present invention provides a system and method of early detection of encrypted signals within a secure connection for Voice Over IP (VoIP).

BACKGROUND OF THE INVENTION

Advances within Internet technologies have spawned new mechanisms of data, voice, and video communication including Internet Protocol (IP) telephony, which is a quickly developing field of telecommunications. However, the Internet is faced with two significant obstacles to fast, yet secure, communications. The first obstacle is usable bandwidth. Bandwidth affects the rate at which data can be transferred. The second obstacle pertains to security. The Internet is not a direct point-to-point connection between computers. Rather, it is a network to which computers (or other devices) can connect for the purpose of communicating with one another. As such, there is increased opportunity for eavesdropping on data, voice, or video transmissions over the Internet. One method of enhancing the security of Internet based communications is to encrypt the data being transmitted before sending it out over the network and to decrypt the data once it is received by the far end device. Voice security is desirable for VoIP connections over an IP network.

The present invention addresses security issues with respect to VoIP telephone calls. Currently, a call signalling channel is secured by using Transport Layer Security (TLS), Secure Sockets Layer (SSL), or IP Security Protocol (IPSec) on a secure well-known port. These approaches, however, suffer from delays in call setup time, complex handshaking procedures, and significant protocol overhead. Moreover, some VoIP implementations do not prevent signalling information from being viewed by unscrupulous computer hackers on the IP network used for VoIP calls. In some instances, when a SETUP message is sent over the IP network, the calling name and calling number are visible to sniffers or other such tools used on the Internet. To overcome this, voice packets are encrypted at a source and decrypted at the destination in order that a third party cannot eavesdrop on the conversation.

In order to properly advise both endpoints as to how to encrypt the voice packet, media signalling must carry the appropriate security information for negotiation requirements. This signalling must also be passed over a secure channel in order that third parties are not aware of what encryption procedures are being negotiated. Unfortunately, the delay of the signalling path relative to the established voice path can result in some undesirable side effects. In FIG. 1, a typical VoIP system including an Internet Protocol Network 100 is shown with a signalling path 15 shown relative to an established voice path 14 between two IP telephony devices 10, 13. A switch 11 is represented in the signalling path 15. Clearly, the shorter path exists in-band. The main concerns in such a VoIP system include noise and voice clipping. Noise occurs when the receiver attempts to decipher a real time transport protocol (RTP) packet based on a “best guess”, but the packets it receives before the signalling arrives were produced with a different cipher, or with no cipher. Voice clipping occurs because the receiver may not play any RTP packets until final negotiation, in which case initial packets would be missed. Typically, the receiver must wait for the final confirmation of the negotiated capabilities of the endpoints before accepting the voice stream packets. On the other hand, if the receiver does not wait for the confirmation, loud “noise” may be played out when the capabilities of the transmitter and receiver do not match.

What is needed is a method that increases security, simplifies VoIP handshaking procedures, and reduces call setup time without adding significant protocol overhead. Further, what is needed is a method that addresses both noise and voice clipping concerns.

SUMMARY OF THE INVENTION

The object of the invention is to remedy the drawbacks set out above by proposing a method that inserts an early encryption detector into the voice path.

The present invention includes a system and method whereby the receiver does not have to wait for the final confirmation of the negotiated capabilities of the endpoints before accepting the voice stream packets. This avoids clipped voice (discarded packets) at call setup caused by the signalling path over a VoIP network having a much larger delay than the voice path. The present invention avoids loud “noise” being played out when the capabilities of the transmitter and receiver do not match.

The present inventive system and method includes a non-complex, in-band, early encryption detector within the voice path (RTP stream). The transmitter sends out a known pattern (for example zeros). Based upon the received pattern, the receiver decides whether its encryption capabilities match up with those of the transmitter. If the capabilities do not match, then the receiver waits for the signalling message for the correct mode of operation. No packets are utilized until the receiver and transmitter encryption capabilities are matched.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a typical VoIP system with a signalling path and an established voice path between two IP telephony devices.

FIG. 2 is a flow diagram in accordance with the preferred embodiment of the present invention.

FIG. 3 is a flow diagram in accordance with an alternative embodiment of the present invention.

FIG. 4 is a graphical representation of a voice signal with G.711 showing the application of the method in accordance with the present invention.

FIG. 5 is a graphical representation of a voice signal with G.729 showing the application of the method in accordance with the present invention.

DETAILED DESCRIPTION

The method of the present invention includes early encryption detection during call setup for a call utilizing voice encryption. Such early detection is shown by way of the flowchart in FIG. 2. It should be understood that, at the start of the call, the first N (where N is an integer) packets are modified at the transmitter with a specific pattern. This is shown at step 150 in FIG. 2. After the Nth packet (step 140), the pattern insertion step 150 would be bypassed. If the packet were encryption enabled (step 160), then the packet would be encrypted at step 170. The inserted pattern is used at the receiver end to indicate matching capabilities and is discussed in further detail below. FIG. 2 also shows the methodology used at the receiver end if the first delivered packet(s) arrive(s) before the signalling message.
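The transmitter-side steps 140 through 170 described above can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the function name prepare_packet, the toy XOR cipher, and the choice to overwrite the leading bytes of the payload are all assumptions made for the example, since the description does not fix where in the payload the pattern is written.

```python
from typing import Callable, Optional

N_PATTERN_PACKETS = 2            # N: leading packets that carry the pattern
PATTERN = bytes([0xFF]) * 8      # 8 bytes of G.711 mu-law silence


def prepare_packet(payload: bytes, index: int,
                   encrypt: Optional[Callable[[bytes], bytes]] = None,
                   n: int = N_PATTERN_PACKETS,
                   pattern: bytes = PATTERN) -> bytes:
    """Return the payload to send for the packet at zero-based position `index`.

    Assumption: the pattern overwrites the first len(pattern) payload bytes.
    """
    if len(payload) < len(pattern):
        raise ValueError("payload shorter than detection pattern")
    if index < n:                                    # step 140: within first N?
        payload = pattern + payload[len(pattern):]   # step 150: insert pattern
    if encrypt is not None:                          # step 160: encryption enabled?
        payload = encrypt(payload)                   # step 170: encrypt
    return payload


def toy_xor(data: bytes) -> bytes:
    """Hypothetical stand-in cipher for illustration only; NOT secure."""
    return bytes(b ^ 0x5A for b in data)
```

Calling prepare_packet(voice, 0, encrypt=toy_xor) yields an encrypted payload whose plaintext begins with the pattern, while any packet with index 2 or higher passes through with its voice data unmodified.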

Incoming packets from the Internet Protocol Network 100 are received. The method checks for a specific pattern in the first K (where K is an integer) received packets at step 200. The method then determines whether or not the specific pattern is detected within the unencrypted packet at step 201. If the specific pattern is found within the unencrypted packet, then the transmitter is determined to have sent the voice as unencrypted. The cipher is changed to non-decryption mode in step 201a. Thereafter, all following packets are treated as non-encrypted and played out at step 400.

If the method determines in step 201 that the specific pattern is not detected, the receiver decrypts the packet at step 202 and searches for the pattern again at step 203. If the specific pattern is detected at step 203, then the cipher algorithms at the transmitter and receiver are matched and the cipher is changed to decryption mode at step 203a. The packets are then decrypted at step 203b and played out at step 400. If the specific pattern cannot be detected at step 203 (either on the unencrypted or decrypted packet), the receiver cannot make a decision on the mode of encryption of the transmitter. Consequently, all such packets are discarded at step 300 until the appropriate signalling message is received in the form of the specific pattern detection that serves to confirm the mode of operation of the transmitter.
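The receiver-side decision of FIG. 2 (steps 201 through 203 and 300) can be sketched for a single packet as shown below. The names Mode, classify_packet, and xor_cipher are hypothetical, the XOR cipher is a toy stand-in for whatever pre-configured cipher the receiver holds, and the pattern is assumed to sit at the start of the payload.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Mode(Enum):
    CLEAR = "non-decryption"     # step 201a
    DECRYPT = "decryption"       # step 203a
    UNKNOWN = "undetermined"     # step 300: discard and wait


def classify_packet(packet: bytes,
                    decrypt: Callable[[bytes], bytes],
                    pattern: bytes) -> Tuple[Mode, Optional[bytes]]:
    """Return the detected mode and the payload to play out (None if discarded)."""
    if packet.startswith(pattern):       # step 201: pattern visible as received
        return Mode.CLEAR, packet        # transmitter sent unencrypted voice
    plain = decrypt(packet)              # step 202: try the pre-configured cipher
    if plain.startswith(pattern):        # step 203: pattern visible after decryption
        return Mode.DECRYPT, plain       # ciphers match
    return Mode.UNKNOWN, None            # step 300: no decision possible


def xor_cipher(key: int) -> Callable[[bytes], bytes]:
    """Hypothetical toy cipher (self-inverse XOR); NOT secure."""
    return lambda data: bytes(b ^ key for b in data)
```

In this sketch a packet produced with a different key falls through both checks and is discarded, so garbled audio is never played out as noise.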

In accordance with the preferred embodiment of the present invention, the specific pattern detected is a string of silence. This pattern of silence depends on the voice CODEC type. For example, such pattern of silence is 0xff in G.711 (mu-law); in G.711 (a-law), such pattern of silence is 0xd5; and, for G.729 such pattern of silence is 0x00. Other CODECs may have different silence patterns. It should be understood by one skilled in the art of audio compression protocols that the G.7xx CODECs (e.g., G.711, G.721, G.722, G.726, G.727, G.728, G.729) are a suite of standards developed under the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T) for audio compression and de-compression. These standards are primarily used in telephony. Within G.711, two main companding algorithms are defined: the “mu-law” algorithm (used in North America) and the “a-law” algorithm (used in Europe and the rest of the world).
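To illustrate why 0xff and 0xd5 act as silence in G.711, the sketch below expands a single code byte to a linear sample using the commonly circulated G.711 expansion logic. The function names are chosen for this example, and the code is offered as an illustration rather than a conformance-tested decoder. G.729 is not covered, since its frames are not decoded sample by sample.

```python
def ulaw_to_linear(code: int) -> int:
    """Expand one G.711 mu-law byte to a signed linear sample (16-bit scale)."""
    code = ~code & 0xFF                     # mu-law bytes are stored inverted
    t = ((code & 0x0F) << 3) + 0x84         # mantissa plus bias
    t <<= (code & 0x70) >> 4                # apply segment (exponent)
    return (0x84 - t) if code & 0x80 else (t - 0x84)


def alaw_to_linear(code: int) -> int:
    """Expand one G.711 a-law byte to a signed linear sample (16-bit scale)."""
    code ^= 0x55                            # undo even-bit inversion
    t = (code & 0x0F) << 4                  # mantissa
    seg = (code & 0x70) >> 4                # segment number
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if code & 0x80 else -t         # sign bit set means positive
```

Under this expansion, the mu-law byte 0xff maps to a sample of exactly zero and the a-law byte 0xd5 maps to the smallest positive step, so a run of either byte plays out as silence.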

In FIG. 3, an alternative embodiment is shown according to the present invention. In such alternative embodiment, encryption is always present. As in FIG. 2, the first N packets are modified (step 150) with a specific pattern at the start of the call at the transmitter end. The packet is then encrypted at step 170. After the Nth packet (step 140), the pattern insertion step 150 would be bypassed. The inserted pattern is used at the receiver end to indicate matching capabilities and is discussed in further detail below.

Incoming packets from the Internet Protocol Network 100 are received. The method receives the first K (where K is an integer) packets at step 200. The receiver decrypts the first K packets at step 202 and searches for the pattern at step 203. If the specific pattern is detected at step 203, then the packets are played out at step 400. If the specific pattern cannot be detected at step 203, the receiver cannot confirm the mode of encryption of the transmitter. Consequently, all such packets are discarded at step 300 until the appropriate in-band signalling message is received in the form of the specific pattern detection that serves to confirm the mode of operation of the transmitter.
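The always-encrypted receiver of FIG. 3 can be sketched as a loop over incoming packets. The function name receive_always_encrypted and the toy XOR cipher are hypothetical. As a simplification, this sketch keeps testing each packet until the pattern is first seen rather than limiting the test to the first K packets, and it assumes the pattern sits at the start of the decrypted payload.

```python
from typing import Callable, Iterable, List


def receive_always_encrypted(packets: Iterable[bytes],
                             decrypt: Callable[[bytes], bytes],
                             pattern: bytes) -> List[bytes]:
    """Return the decrypted payloads that would be played out."""
    confirmed = False
    played: List[bytes] = []
    for packet in packets:
        plain = decrypt(packet)                  # step 202: always decrypt
        if not confirmed:
            if plain.startswith(pattern):        # step 203: pattern found
                confirmed = True
            else:
                continue                         # step 300: discard
        played.append(plain)                     # step 400: play out
    return played


def xor_cipher(key: int) -> Callable[[bytes], bytes]:
    """Hypothetical toy cipher (self-inverse XOR); NOT secure."""
    return lambda data: bytes(b ^ key for b in data)
```

Once the pattern has confirmed the cipher, later packets that no longer carry the pattern are still decrypted and played. A stream produced with a mismatched key never confirms, so nothing is played.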

In G.711, the chosen length of the silence string is 8 bytes, whereas for G.729 it is a full G.729 frame of 10 bytes. This makes the inventive method compatible with non-compliant receivers. The silence bytes, or frame for G.729, will have minimum impact on voice quality. In the G.729 case, the frame erasure feature may be invoked. For other CODEC types possessing the frame erasure capability, one would also choose a pattern that would invoke packet loss concealment (PLC) algorithms. Such PLC algorithms, also known as frame erasure concealment algorithms, hide transmission losses in an audio system where the input signal is encoded and packetized at a transmitter, sent over a network, and received at a receiver that decodes the packet and plays out the output.
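The silence bytes and pattern lengths given in this description can be collected into a small lookup, sketched below. The dictionary keys and the function name detection_pattern are assumptions made for the example; only the byte values and lengths come from the text.

```python
# (silence byte, pattern length in bytes) per CODEC, as stated in the description
SILENCE_PATTERNS = {
    "G.711 mu-law": (0xFF, 8),
    "G.711 a-law": (0xD5, 8),
    "G.729": (0x00, 10),      # one full G.729 frame
}


def detection_pattern(codec: str) -> bytes:
    """Build the early-detection silence pattern for the named CODEC."""
    try:
        silence_byte, length = SILENCE_PATTERNS[codec]
    except KeyError:
        raise ValueError(f"no silence pattern known for CODEC {codec!r}") from None
    return bytes([silence_byte]) * length
```

Other CODECs would need their own entries, chosen so that the pattern has minimal audible impact or triggers the CODEC's concealment behaviour.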

Within the inventive method, the number of packets N that are modified at the start of the call is chosen to be two (N=2). While specifically two is chosen, it should be understood that any number of packets may be modified without straying from the intended scope of the present invention so long as more than one packet is modified to counter potential packet loss at the start of the call. The number of received packets to key on is chosen to be one (K=1) or some number of packets that is less than the N packets modified at the transmitter.
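The reason for modifying more than one packet can be made concrete with a simple loss model. The sketch below assumes, purely for illustration, that each packet is lost independently with the same probability. Real networks often lose packets in bursts, so the figures are indicative only. The function names are hypothetical.

```python
def prob_all_pattern_packets_lost(loss_rate: float, n: int) -> float:
    """Probability that none of the N pattern packets arrives,
    assuming independent losses at rate `loss_rate`."""
    if not 0.0 <= loss_rate < 1.0 or n < 1:
        raise ValueError("need 0 <= loss_rate < 1 and n >= 1")
    return loss_rate ** n


def min_pattern_packets(loss_rate: float, target: float) -> int:
    """Smallest N for which the chance of losing every pattern packet
    is at most `target`, under the same independence assumption."""
    if not 0.0 <= loss_rate < 1.0 or target <= 0.0:
        raise ValueError("need 0 <= loss_rate < 1 and target > 0")
    n = 1
    while loss_rate ** n > target:
        n += 1
    return n
```

Under this model, N=2 at a 5% loss rate leaves about a 0.25% chance that the receiver sees no pattern packet at all, whereas a 20% loss rate would call for a larger N to reach a comparable level.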

FIGS. 4 and 5 graphically show the effect of the silence patterns on a voice signal. FIG. 4 shows the G.711 case. The dotted line is the signal with the early detection pattern (silence in this case). As can be seen between samples 160 and 170, 8 bytes of samples are overwritten with silence. FIG. 5 shows the G.729 case with the dotted line indicating the decoded G.729 signal with the early detection pattern. No distinctive area exists in the G.729 case that shows signal error, though 400 samples were needed for any error to ripple out completely. As can be seen from both graphs, the impact on the signal is small. Subjective listening tests have also confirmed that the impact on voice quality is minimal, such that the practical impact on a user and the perceived audio is negligible.

Instead of using a silence pattern, it should be readily apparent that other patterns may also be used without straying from the intended scope of the present invention. For example, any pattern can be used for G.729, as long as the parity bit indicates frame erasure. The G.729 decoder will invoke the frame erasure feature and ignore all other data in the frame. Different lengths of pattern can be used (8 bytes for G.711 is suitable, though 4 bytes is sufficient). The number of frames modified with the pattern indication may be different from 2. Networks with high packet loss may require more packets.

Other capabilities may be sent in-band from the transmitter to the receiver. Such capabilities may include transmitter characteristics or any other useful information that may be embedded in the VoIP packets.

The above-described embodiments of the present invention are intended to be examples only. Alterations, modifications and variations may be effected to the particular embodiments by those of skill in the art without departing from the scope of the invention, which is defined solely by the claims appended hereto. 

1. A method of providing encryption detection within packet data network telephony calls comprising: at a receiver end of said packet data network, providing in-band early detection of encrypted packets; and decrypting said encrypted packets.
 2. The method as claimed in claim 1, further including receiving, at a receiver end of said packet data network, at least one transmitted packet; where said providing and decrypting steps further include determining whether said transmitted packet includes a predetermined pattern inserted into said transmitted packet at a transmitter end of said packet data network; upon finding said predetermined pattern, changing a cipher to non-decryption mode and delivering said transmitted packet; upon finding no said predetermined pattern, decrypting said transmitted packet with a pre-configured cipher and further determining whether said transmitted packet includes said predetermined pattern; upon finding said predetermined pattern, changing said cipher to decryption mode, decrypting said transmitted packet, and delivering said transmitted packet; and upon finding no said predetermined pattern, discarding said transmitted packet and processing another one of said at least one transmitted packets from said receiving step.
 3. The method of claim 2 wherein said predetermined pattern is a silence pattern.
 4. The method of claim 2 wherein said predetermined pattern is a series of zeros.
 5. An encryption detector for providing early detection of encrypted packets within packet data network telephony calls operating under control of a computer program, said computer program using computer program code comprised of: computer program code operative prior to receipt of a transmitted packet and including: computer program code for receiving, at a receiver end of said packet data network, at least one transmitted packet; and computer program code for determining whether said transmitted packet includes a predetermined pattern inserted into said transmitted packet at a transmitter end of said packet data network; upon finding said predetermined pattern, changing a cipher to non-decryption mode and delivering said transmitted packet; upon finding no said predetermined pattern, decrypting said transmitted packet with a pre-configured cipher and further determining whether said transmitted packet includes said predetermined pattern; upon finding said predetermined pattern, changing said cipher to decryption mode, decrypting said transmitted packet, and delivering said transmitted packet; and upon finding no said predetermined pattern, discarding said transmitted packet and processing another one of said at least one transmitted packets from said receiving step.
 6. The encryption detector of claim 5 wherein said predetermined pattern is a silence pattern.
 7. The encryption detector of claim 5 wherein said predetermined pattern is a series of zeros.
 8. A system of providing encryption detection within packet data network telephony calls comprising: a transmitter for inserting a predetermined pattern into at least two transmitted packets at a transmitter end of said packet data network; a receiver for receiving each of said at least two transmitted packets at a receiver end of said packet data network; said receiver including a means for determining whether one of said transmitted packets includes said predetermined pattern; a means for changing a cipher to non-decryption mode and for delivering said transmitted packet upon finding said predetermined pattern; a means for decrypting said transmitted packet with a pre-configured cipher and for further determining whether said transmitted packet includes said predetermined pattern upon finding no said predetermined pattern; a means for changing said cipher to decryption mode, for decrypting said transmitted packet, and for delivering said transmitted packet upon finding said predetermined pattern; and a means for discarding said transmitted packet and for processing another one of said at least one transmitted packets from said receiving step upon finding no said predetermined pattern.
 9. The system of claim 8 wherein said predetermined pattern is a silence pattern.
 10. The system of claim 8 wherein said predetermined pattern is a series of zeros.
 11. A method of providing encryption detection within packet data network telephony calls comprising: receiving, at a receiver end of said packet data network, at least one encrypted transmitted packet; decrypting said encrypted transmitted packet to form a decrypted packet; determining whether said decrypted packet includes a predetermined pattern inserted into said encrypted transmitted packet at a transmitter end of said packet data network; upon finding said predetermined pattern, decrypting said transmitted packet, and delivering said transmitted packet; and upon finding no said predetermined pattern, discarding said transmitted packet and processing another one of said at least one transmitted packets from said receiving step.
 12. The method of claim 11 wherein said predetermined pattern is a silence pattern.
 13. The method of claim 11 wherein said predetermined pattern is a series of zeros.
 14. An encryption detector for providing early detection of encrypted packets within packet data network telephony calls operating under control of a computer program, said computer program using computer program code comprised of: computer program code operative prior to receipt of a transmitted packet and including: computer program code for receiving, at a receiver end of said packet data network, at least one transmitted packet; computer program code for decrypting said transmitted packet with a pre-configured cipher; computer program code for determining whether said transmitted packet includes a predetermined pattern inserted into said transmitted packet at a transmitter end of said packet data network; upon finding said predetermined pattern, decrypting said transmitted packet, and delivering said transmitted packet; and upon finding no said predetermined pattern, discarding said transmitted packet and processing another one of said at least one transmitted packets from said receiving step.
 15. The encryption detector of claim 14 wherein said predetermined pattern is a silence pattern.
 16. The encryption detector of claim 14 wherein said predetermined pattern is a series of zeros.
 17. A system of providing encryption detection within packet data network telephony calls comprising: a transmitter for inserting a predetermined pattern into at least two transmitted packets at a transmitter end of said packet data network; a receiver for receiving each of said at least two transmitted packets at a receiver end of said packet data network; said receiver including a means for decrypting said transmitted packet with a pre-configured cipher, a means for determining whether one of said transmitted packets includes said predetermined pattern, a means for delivering said transmitted packet upon finding said predetermined pattern; and a means for discarding said transmitted packet and for processing another one of said at least one transmitted packets from said receiving step upon finding no said predetermined pattern.
 18. The system of claim 17 wherein said predetermined pattern is a silence pattern.
 19. The system of claim 17 wherein said predetermined pattern is a series of zeros. 